Composite Repetition-Aware Data Structures
Abstract
In highly repetitive strings, like collections of genomes from the same species, distinct measures of repetition all grow sublinearly in the length of the text, and indexes targeted to such strings typically depend on only one of these measures. We describe two data structures whose size depends on multiple measures of repetition at once, and that provide competitive tradeoffs between the time for counting and reporting all the exact occurrences of a pattern, and the space taken by the structure. The key component of our constructions is the run-length encoded BWT (RLBWT), which takes space proportional to the number of BWT runs: rather than augmenting RLBWT with suffix array samples, we combine it with data structures from LZ77 indexes, which take space proportional to the number of LZ77 factors, and with the compact directed acyclic word graph (CDAWG), which takes space proportional to the number of extensions of maximal repeats. The combination of CDAWG and RLBWT also enables a new representation of the suffix tree, whose size again depends on the number of extensions of maximal repeats, and that is powerful enough to support matching statistics and constant-space traversal.
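To make two of the repetition measures named in the abstract concrete, the following is a minimal sketch that computes the number of BWT runs and the number of LZ77 factors of a short string. It is not the construction from the paper: the function names, the `$` sentinel, the quadratic rotation sort, and the self-overlapping greedy LZ77 variant are all assumptions chosen for illustration.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transform via sorted rotations.

    Illustration only: this takes quadratic space, and it assumes the
    sentinel is absent from the text and sorts before every text character.
    """
    assert sentinel not in text
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def run_length_encode(s: str) -> list[tuple[str, int]]:
    """Collapse each maximal run of equal characters into (char, length)."""
    runs: list[tuple[str, int]] = []
    for ch in s:
        if runs and runs[-1][0] == ch:
            runs[-1] = (ch, runs[-1][1] + 1)
        else:
            runs.append((ch, 1))
    return runs


def lz77_factor_count(text: str) -> int:
    """Count factors of a greedy LZ77 parse (naive, self-overlap allowed).

    Each factor is the longest prefix of the remaining text that also
    starts at an earlier position, or a single fresh character.
    """
    n = len(text)
    i = 0
    factors = 0
    while i < n:
        longest = 0
        for j in range(i):  # try every earlier starting position
            k = 0
            while i + k < n and text[j + k] == text[i + k]:
                k += 1
            longest = max(longest, k)
        i += max(longest, 1)
        factors += 1
    return factors


if __name__ == "__main__":
    transformed = bwt("banana")
    runs = run_length_encode(transformed)
    print(transformed, runs, len(runs), lz77_factor_count("banana"))
```

On a highly repetitive input, both the number of runs and the number of factors stay small relative to the text length, which is the property the indexes described above rely on.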
Similar resources
Practical combinations of repetition-aware data structures
Highly-repetitive collections of strings are increasingly being amassed by genome sequencing and genetic variation experiments, as well as by storing all versions of human-generated files, like webpages and source code. Existing indexes for locating all the exact occurrences of a pattern in a highly-repetitive string take advantage of a single measure of repetition. However, multiple, distinct ...
A comprehensive review on modeling of nanocomposite materials and structures
This work presents a historical review of the research carried out by various scientists and engineers dealing with nanocomposite materials and continuous systems manufactured from such materials. Nanocomposites are an advanced type of the well-known composite materials, reinforced with nanosize fibers and/or particles. Such materials can be better suited for the industrial...
Selective scan slice repetition for simultaneous reduction of test power consumption and test data volume
In this paper, we present a selective scan slice encoding technique for power-aware test data compression. The proposed scheme dramatically reduces test data volume via scan slice repetition, and generates an adjacent-filled test pattern known as the favorable low-power pattern mapping method. Experiments were performed on the large ITC’99 benchmark circuits, and results show the effectiveness o...
Composite Materials with Self-Contained Wireless Sensing Networks
The increasing demand for in-service structural health monitoring, particularly in the aircraft industry, has stimulated efforts to integrate self sensing capabilities into materials and structures. This work presents efforts to develop structural composite materials which include networks of sensors with decision-making capabilities that extend the functionality of the composite materials to b...
Investigation of Nonlinear Behavior of Composite Bracing Structures with Concrete Columns and Steel Beams (RCS) Applying Finite Element Method
The composite structural system (RCS) is a new type of moment frame that combines concrete columns (RC) and steel beams (S). These structural systems have the advantages of both concrete and steel frames [1]. In previous research on composite structures, there are some studies regarding RCS composite connections, but there is no investigation about seismic resisting system...
Publication date: 2015